CNN quantization and compression strategy for edge computing applications
CAI Ruichu, ZHONG Chunrong, YU Yang, CHEN Bingfeng, LU Ye, CHEN Yao
Journal of Computer Applications    2018, 38 (9): 2449-2454.   DOI: 10.11772/j.issn.1001-9081.2018020477
Focused on the problem that the memory- and computation-intensive nature of Convolutional Neural Network (CNN) limits its adoption on embedded devices such as edge computing platforms, a CNN compression method combining network weight pruning with data quantization tailored to embedded hardware data types was proposed. Firstly, according to the weight distribution of each layer of the original CNN, a threshold-based pruning method was applied to eliminate the weights that have little impact on network accuracy, removing redundant information from the network model while preserving the important connections. Secondly, the required bit-width of the weights and activations was analyzed according to the computational characteristics of the embedded platform, and a dynamic fixed-point quantization method was employed to reduce the bit-width of the network model. Finally, the network was fine-tuned to further compress the model size and reduce the computational cost while maintaining inference accuracy. The experimental results show that this method reduces the storage space of VGG-19 by more than 22 times with an accuracy drop of only 0.3%, achieving almost lossless compression. Evaluated on multiple models, the method reduces model storage space by up to 25 times within an average accuracy loss of 1.46%, which demonstrates the effectiveness of the proposed compression method.
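The two core steps described in the abstract, threshold-based weight pruning and dynamic fixed-point quantization, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function names, the threshold value, and the 8-bit width below are illustrative assumptions, and the fractional length is chosen per tensor so that the largest magnitude still fits in the chosen bit-width (the usual dynamic fixed-point convention).

```python
import numpy as np

def threshold_prune(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold,
    keeping the connections that matter most for accuracy."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def dynamic_fixed_point_quantize(values, bit_width):
    """Quantize a tensor to a dynamic fixed-point format: the fractional
    length is picked per tensor from its largest magnitude."""
    max_abs = np.max(np.abs(values))
    # Integer bits needed for the largest magnitude, plus one sign bit.
    int_bits = int(np.ceil(np.log2(max_abs))) + 1 if max_abs > 0 else 1
    frac_bits = bit_width - int_bits
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    quantized = np.clip(np.round(values * scale), qmin, qmax)
    return quantized / scale, frac_bits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_weights = rng.normal(0.0, 0.05, size=(64, 64)).astype(np.float32)

    pruned, mask = threshold_prune(layer_weights, threshold=0.02)
    quantized, frac_bits = dynamic_fixed_point_quantize(pruned, bit_width=8)

    sparsity = 1.0 - mask.mean()
    print(f"sparsity after pruning: {sparsity:.2%}, fractional bits: {frac_bits}")
```

In the paper's pipeline, a fine-tuning pass would follow these two steps to recover the small accuracy loss introduced by pruning and quantization.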